Detection of COVID-19 Case from Chest CT Images Using Deformable Deep Convolutional Neural Network

The infectious coronavirus disease (COVID-19) has become a great threat to global human health. Timely and rapid detection of COVID-19 cases is crucial to control its spread through isolation measures as well as for proper treatment. Though the real-time reverse transcription-polymerase chain reaction (RT-PCR) test is a widely used technique for detecting COVID-19 infection, recent research suggests chest computed tomography (CT)-based screening as an effective substitute when RT-PCR is limited by time and availability. In consequence, deep learning-based COVID-19 detection from chest CT images is gaining momentum. Furthermore, visual analysis of data has enhanced the opportunities for maximizing prediction performance in this big data and deep learning realm. In this article, we propose two separate deformable deep networks, derived from the conventional convolutional neural network (CNN) and the state-of-the-art ResNet-50, to detect COVID-19 cases from chest CT images. The impact of the deformable concept has been observed through a comparative performance analysis of the designed deformable and normal models, and the deformable models show better prediction results than their normal forms. Furthermore, the proposed deformable ResNet-50 model performs better than the proposed deformable CNN model. The gradient class activation mapping (Grad-CAM) technique has been used to visualize and check the localization of the targeted regions at the final convolutional layer, and the localization has been found excellent. A total of 2481 chest CT images have been used to evaluate the performance of the proposed models with a random train-valid-test split of 80 : 10 : 10. The proposed deformable ResNet-50 model achieved a training accuracy of 99.5% and a test accuracy of 97.6% with a specificity of 98.5% and a sensitivity of 96.5%, which are satisfactory compared with related works.
The comprehensive discussion demonstrates that the proposed deformable ResNet-50 model-based COVID-19 detection technique can be useful for clinical applications.


Introduction
A massive outbreak of novel coronavirus disease (COVID-19) occurred in Wuhan, China, in December 2019, and it is causing a pandemic situation worldwide. According to the World Health Organization (WHO), around 476 million confirmed cases of COVID-19 including 6.1 million deaths were reported worldwide as of March 25, 2022 [1,2]. The death rate is slightly less than 2%, but the main concern is the highly infectious nature of COVID-19 disease. The diagnosis of COVID-19 and isolation of patients are the most critical parts of controlling this pandemic situation. The mainstream diagnosis system is the real-time reverse transcription-polymerase chain reaction (RT-PCR) technique, which is not readily accessible to all hospitals and clinics. It also takes a long time to get the test results. Nucleic acid amplification testing (NAAT) is another technique for COVID-19 diagnosis, which is also time consuming and exhibits low precision as reported in [3]. Chest imaging-based modalities such as X-ray (CXR) [4,5], computed tomography (CT) [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21], and ultrasound imaging [22,23] are becoming popular alternatives to the pathological tests, not only for accurate screening of COVID-19 cases but also for predicting the severity of the disease. Furthermore, recent studies show the promise of medical image-based IoT healthcare frameworks for COVID-19 detection and of social isolation suggestions through digital surveillance to decelerate the spread of COVID-19 [24][25][26][27]. Since a large amount of private patient information is gathered in the IoT healthcare system for data fusion in COVID detection, a secured and protected system should be established for virtual medical facilities [28][29][30][31]. Computer-aided diagnosis is becoming an obvious and much-needed support to medical experts for proper diagnosis, prognosis, and treatment, since the manual assessment of physicians is subjective in nature.
Recent rapid advances in machine learning tools, especially deep learning, have increased the power of computer-aided diagnosis significantly [32]. Therefore, researchers are moving to diagnostic systems that apply machine learning to medical imaging because of its promise for testing results and severity analysis. In this context, healthcare data visualization is of great importance for proper analysis, interpretation, and accurate prediction, as it highlights patterns, characteristics, and correlations. Therefore, in this big data and deep learning realm, healthcare researchers and industries are emphasizing visual analysis of data in order to maximize the efficiency of data-driven decisions and services.
In this study, chest CT scan images are used for COVID-19 detection due to their higher sensitivity than RT-PCR testing, as demonstrated in [33]. The explicit form of the lungs and the presence of high rates of ground glass opacity (GGO) in COVID-19-infected lungs can be easily seen in CT scan images. Considering the proven extraordinary performance of recent deep learning techniques in computer-aided detection, we have employed a deep learning technique on CT images for detection of COVID-19 cases. Deep learning (DL) is a class of machine learning (ML) in which multiple hidden layers are incorporated into the model to extract more complex features from raw input data. Nowadays, DL techniques have been successfully implemented in various fields such as image processing, image recognition and verification, network security, medical imaging, and healthcare.
There are many well-established deep convolutional neural network (CNN) architectures such as VGG16, ResNet-50, InceptionV3, and EfficientNet for object detection and classification tasks using images as input data. Over the last two years of the pandemic, various research works were conducted based on DL methods for COVID-19 classification, but to the best of our knowledge, the deformable CNN has not yet been used in this area. In this study, we have proposed two separate deformable deep convolutional networks based on the conventional CNN and the state-of-the-art ResNet-50 for COVID-19 detection from chest CT, with a strategic emphasis on finding the impact of the deformable concept through a comparative performance analysis of the normal and deformable forms. The gradient class activation mapping (Grad-CAM) technique has been used to visualize and check the localization of the targeted regions at the final convolutional layer of the models. The main contributions of this work are as follows:
(i) Designing deformable convolutional neural network models in order to detect COVID-19 cases from chest CT images based on the conventional CNN and ResNet-50 architectures.
(ii) Tuning the model to achieve superior performance and consequently training and validating it with a balanced dataset of COVID-19 chest CT images.
(iii) Visual inspection of the localization capability of the convolution layers through Grad-CAM.
(iv) Performance evaluation and inspection of the impact of deformable layers of the proposed models as well as comparative analysis with the related state-of-the-art techniques.
The rest of this article is organized as follows: A literature review of recent work is given in the related works section. The methodology section explains the proposed methodology as well as the model evaluation process. The next subsection presents the dataset used in this work, and the experimental results are analyzed in the experimental results and discussions section. Finally, the conclusion section concludes the whole research work.

Related Works
Numerous research works have been performed to diagnose COVID-19 from chest CT scan and X-ray images using ML and DL techniques. This section presents some recent studies related to COVID-19 detection from CT images applying DL techniques. A nine-layer tailored deep CNN model is proposed in [4] for COVID-19 screening using both CT and CXR images. The authors reported an overall accuracy of 96.28% using a small dataset. Yasar and Ceylan [6] also proposed a deep CNN model with 23 layers, and it achieved a highest accuracy of 95.99%. Loey et al. [7] examined different well-known deep CNN architectures such as AlexNet, VGGNet16, and ResNet-50, utilizing the transfer learning technique for COVID-19 diagnosis using CT images. This work showed that ResNet-50 can predict better than the others, with a test accuracy of 82.91%. Some work has been performed on segmentation as well as detection of COVID-19 using CT images in [8,9], achieving accuracies of 94% and 94.67%, respectively. Ni et al. [8] proposed a combination of 3D U-Net and MVP-Net based architectures, whereas Amyar et al. [9] presented a multitask learning architecture with an encoder and decoder system.
Singh et al. [10] designed a multiobjective differential mode-based CNN method for classification of COVID-19 cases, and their accuracy level is less than 93.5%. A machine-driven design exploration strategy-based deep CNN model is proposed for COVID-19 diagnosis in [11]. Wang et al. [12] designed a model by coupling two 3D U-Net architectures together for COVID-19 screening in CT images, and their classification accuracy reached 93.3%. A weakly supervised network is designed using the ResNext+ architecture along with bidirectional LSTM blocks for prediction of COVID-19 cases from volume and slice-level CT images in [13].
Ensemble learning is now becoming a popular technique because it offers higher precision and accuracy than a single model. Several studies implemented an ensemble of transfer learning using different pretrained deep neural network architectures such as VGG, Xception, and ResNet for screening COVID-19 cases. Aversano et al. [14] exploited the transfer learning technique by using pretrained models such as VGG, Xception, and ResNet individually and then combining them into an ensemble model. Their experiments show F1-score values ranging from 0.94 to 0.95. Gifani et al. [15] used 15 pretrained standard CNN models to build an ensemble architecture with the majority voting rule, with experimental results showing an overall detection accuracy of 85.4%. Biswas et al. [16] also proposed an ensemble of deep transfer learning using VGG16, ResNet-50, and Xception models for CT image classification with good accuracy. In our previous study [17], we developed an ensemble model for COVID-19 screening from CT images, exploiting three deep CNN architectures. The experimental results achieved an accuracy of 96% and a sensitivity of 97% for CT scan image prediction.
From the above research works, it is concluded that deep learning methods can be employed for COVID-19 screening, though with some limitations such as imbalanced datasets and high rates of false prediction. So there is still scope to improve the prediction accuracy as well as the robustness of the methods, so that the false positive and false negative rates are minimized. In this work, we proposed a deep learning approach for COVID-19 detection using CT images. A deformable technique is implemented in the standard ResNet-50 architecture to make the model more robust by replacing a few layers of ResNet-50 with their deformable counterparts to achieve good prediction performance.

Methodology
This section covers three main parts of the methodology for COVID-19 detection: (a) describing the idea of the deformable CNN, (b) explaining the proposed framework using the deformable concept, and (c) presenting the evaluation criteria used to validate the proposed framework. A description of the CT scan dataset used in this research is provided at the end of this section.

Deformable CNN.
Standard CNNs are limited in their ability to model complex geometric transformations due to the fixed geometric composition of their modules. In regular CNN modules, the convolution kernel samples the input at fixed spatial locations, and the pooling layer reduces the spatial resolution at a constant ratio. As a consequence, the effectiveness of the models is reduced for complex transformations. So the adaptive determination of sampling locations, or deformed kernels based on the objects, is required for exact visual recognition. In this regard, Dai et al. [34] introduced deformable convolutional neural networks in work done at Microsoft Research Asia in 2017. They introduced two new modules to enhance the capability of transformation modeling: deformable convolution and deformable ROI pooling. Deformable convolution adds a 2D offset to the sampling locations of the regular convolution grid to deform the kernel in an adaptive manner based on the required objects.
Consider a convolutional kernel with S sampling locations, and let w_i and l_i denote the weight and offset for the i-th location of the kernel, respectively. Then, y(l), denoting the output feature computed from the input feature x at location l, is calculated as follows:
$$y(l) = \sum_{i=1}^{S} w_i \, x(l + l_i) \tag{1}$$
For deformable convolution, equation (1) becomes
$$y(l) = \sum_{i=1}^{S} w_i \, x(l + l_i + \Delta l_i) \tag{2}$$
where the standard grid of S sampling locations is augmented with learnable offsets ∆l_i. As l + l_i + ∆l_i is now fractional, bilinear interpolation is used to calculate x(l + l_i + ∆l_i) in equation (2) [34]. The geometric structure of the deformable convolution kernel is illustrated in Figure 1.
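The sampling rule for deformable convolution can be illustrated with a minimal NumPy sketch. It evaluates the deformable sum at a single output location, reading the input at fractional positions by bilinear interpolation. The function names, the 3 × 3 grid, the uniform weights, and the fixed offsets are assumptions made for illustration; in the actual model the offsets are learned by an additional convolution layer, which is not reproduced here.

```python
import numpy as np

def bilinear_sample(x, r, c):
    """Bilinearly interpolate 2D array x at fractional (row, col).
    Positions outside the array contribute zero."""
    h, w = x.shape
    r0, c0 = int(np.floor(r)), int(np.floor(c))
    value = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < h and 0 <= cc < w:
                weight = (1 - abs(r - rr)) * (1 - abs(c - cc))
                value += weight * x[rr, cc]
    return value

def deformable_conv_at(x, weights, grid, offsets, loc):
    """Sum of w_i * x(l + l_i + dl_i) over all S sampling locations."""
    total = 0.0
    for w_i, l_i, dl_i in zip(weights, grid, offsets):
        r = loc[0] + l_i[0] + dl_i[0]
        c = loc[1] + l_i[1] + dl_i[1]
        total += w_i * bilinear_sample(x, r, c)
    return total

# Hypothetical 3x3 kernel grid (S = 9) with uniform averaging weights.
grid = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
weights = np.full(9, 1.0 / 9.0)

# Toy input whose value is 5*row + col, so interpolation is easy to check.
x = np.arange(25, dtype=float).reshape(5, 5)

# Zero offsets reduce to regular convolution; (0.5, 0.5) shifts every sample.
zero_offsets = [(0.0, 0.0)] * 9
shifted_offsets = [(0.5, 0.5)] * 9

y_regular = deformable_conv_at(x, weights, grid, zero_offsets, (2, 2))
y_deformed = deformable_conv_at(x, weights, grid, shifted_offsets, (2, 2))
print(y_regular, y_deformed)
```

With all offsets set to zero the function reduces to the regular convolution sum, which makes the relationship between the two forms easy to verify on small inputs.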
The offsets for kernel deformation are obtained by standard back-propagation of the gradients through the bilinear interpolation operations during training of the model. An additional convolution layer is used to learn the offset values, as shown in Figure 2. As a consequence, a small number of parameters is added to the model for offset learning. Another study [35] shows that the performance can be enhanced by stacking more deformable layers in standard CNN architectures. Given these benefits of deformable CNNs, we employed this idea for COVID-19 detection.

Proposed Model.
A deformable convolution concept is utilized in this work for the detection of COVID-19 cases from chest CT images. We have designed two separate deformable deep convolutional networks based on the conventional CNN and the state-of-the-art ResNet-50 for the detection task. The strategic emphasis is to observe the influence of the deformable concept through a comparative performance analysis between the normal and deformable forms. Initially, a fifteen-layered deep CNN model is developed, and then its deformable form is created by replacing two convolution layers with deformable convolution layers. The detailed layers and parameter information of both normal and deformable CNN models are shown in Table 1. Before selecting this fifteen-layered model, we experimented with various architectures by tuning different parameters of the models and also the position of the deformable layers to find the best performance. The fifteen-layered structure was then chosen for COVID-19 detection in CT images on the basis of its maximum performance. The total number of parameters of the deformable model is greater than that of the normal CNN model, as some extra parameters are needed for offset learning in the deformable convolution. Every convolution layer uses the ReLU activation function, and the final dense layer uses softmax activation for binary classification.
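As an illustration of the two-unit softmax output used for binary classification, the following minimal NumPy sketch converts a pair of raw scores into class probabilities and evaluates the categorical cross-entropy against a one-hot label. The score values and the class order (non-COVID, COVID) are hypothetical and chosen only for the example.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return float(-np.sum(one_hot * np.log(probs + eps)))

# Hypothetical raw scores from the final dense layer: [non-COVID, COVID].
logits = np.array([2.0, 0.0])
probs = softmax(logits)

# One-hot target for the non-COVID class (index 0 in this sketch).
target = np.array([1.0, 0.0])
loss = categorical_cross_entropy(target, probs)
print(probs, loss)
```

A two-unit softmax with one-hot labels is equivalent in expressive power to a single sigmoid unit, so the choice between them is largely one of convention.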
Overfitting and underfitting are common problems in deep learning models, and both are addressed carefully in our experiments. A dropout layer with a drop rate of 0.4 is used in each model to reduce overfitting. A large dataset is used to train the models to counter underfitting. In addition, the number of layers in the models and the number of training epochs are increased during tuning to address underfitting. The performances of these models are presented in the results section.
To make the COVID-19 classification task more robust and effective, we proposed a state-of-the-art CNN architecture, ResNet-50, in its deformable form, which is shown in Figure 3. It contains five convolutional stages followed by a final fully connected dense layer for classification. Stages 2 to 5 have uniform convolutional blocks (ConvBlock) and identity blocks (ID_Block) in the regular ResNet-50. Each convolutional and identity block contains a skip connection, which was first introduced in the ResNet model and is the main strength of the ResNet architecture. Two of the standard Conv2D layers in the second-stage convolutional block of ResNet-50 are replaced by the corresponding deformable convolution layers (Deform_Conv2D) to form a deformable convolutional block (Deform_ConvBlock). The detailed architectures of each block are also presented in Figure 3. The resulting network is referred to as the deformable CNN or deformable ResNet-50.
The ResNet-50 architecture is selected for the CT image classification task due to its notable performance demonstrated in state-of-the-art medical imaging research [7]. Its skip connections make the deep network easy to train, and deeper networks are more suitable for medical image classification. The ResNet architectures are able to mitigate the vanishing gradient problem due to their identity mappings. So, this robust ResNet-50 model can be effectively used for COVID-19 screening. In this work, the ResNet-50 model is built from scratch according to its defined architecture; no pretrained weights are used for classification.
Journal of Healthcare Engineering
The positions of the deformable layers are fixed after extensive tuning of the model for best performance. The ReLU function is used as the activation in each layer except the final layer, which uses softmax activation for binary prediction. As stated, additional parameters are needed in the deformable CNN model to learn the offsets of the kernel's deformed positions. So the proposed deformable ResNet-50 model requires more parameters than the regular ResNet-50 model. The total number of parameters in the proposed model is 23,771,906, whereas the regular ResNet-50 model contains 23,591,810 parameters. The extra 180,096 parameters used for deformation learning in the proposed model make it more robust. Hence, the proposed method presented in Figure 3 can be an efficient way of screening for COVID-19 using lung CT images.

Evaluation Criteria.
The commonly used assessment metrics for DL classification models are utilized to assess the proposed methodology. The metrics are accuracy, specificity, sensitivity, f1-score, and precision, measured in terms of true and false prediction values. As accuracy alone cannot show the effectiveness of deep learning models for classification, several means of assessment are used in this study.
Besides these metrics, the accuracy and loss curves over the training epochs have also been analyzed for performance evaluation. Equations (3)-(7) represent the definitions of accuracy (Acc), specificity (S_p), sensitivity (S_n), f1-score (F_s), and precision (P_r), respectively:
$$Acc = \frac{tp + tn}{tp + tn + fp + fn} \tag{3}$$
$$S_p = \frac{tn}{tn + fp} \tag{4}$$
$$S_n = \frac{tp}{tp + fn} \tag{5}$$
$$F_s = \frac{2 \, P_r \, S_n}{P_r + S_n} \tag{6}$$
$$P_r = \frac{tp}{tp + fp} \tag{7}$$
where true positive (tp) and true negative (tn) denote the numbers of correct predictions for actual COVID-positive patients and non-COVID patients, respectively. False positive (fp) and false negative (fn) denote the numbers of incorrect predictions of COVID-positive and COVID-negative patients, respectively. The confusion matrix is also utilized to visualize the true and false prediction counts conveniently, as shown in Figure 4.
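The five metrics named above can be computed directly from the four confusion-matrix counts. The sketch below is a minimal illustration using the standard definitions; the function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, specificity, sensitivity, precision, and f1-score
    computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1_score": f1_score,
    }

# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The function does not guard against empty classes; a production version would need to handle a zero denominator, for example when no positive predictions are made.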

Dataset Description.
CT scan images give a detailed and clear view of the lungs as compared to CXR images, so CT is a very convenient way to diagnose COVID-19. A chest CT scan dataset is collected from the Kaggle dataset repository for this experiment on COVID-19 diagnosis. The CT images of this dataset have been collected from real patients in hospitals in Sao Paulo, Brazil [36]. It contains a total of 2481 CT images, including 1252 images of COVID-positive patients and 1229 images of non-COVID cases with other lung diseases. The main sign of COVID-19 in CT is the two-sided presence of irregular ground glass opacities (GGOs) that may merge into dense and consolidative lesions beneath the pleura and along the bronchovascular networks. The number and area of the lesions increase with the disease's progression. Furthermore, besides GGOs, patterns such as interstitial widening, crazy paving, halo and reversed halo signs, and airway and vascular modifications are also found in CT for COVID-19 cases [37]. A few sample COVID and non-COVID CT slices from this dataset are shown in Figure 5. The collected CT dataset is almost balanced, which is an important factor in the learning phase of a deep CNN. An imbalanced dataset may mislead the output prediction in deep learning classification tasks. In this experiment, no preprocessing techniques are applied, because of the irregular opacification present in CT images of pulmonary diseases. Raw CT scans are therefore used for COVID-19 detection, since preprocessing can cause the loss of sensitive information about the texture of the infected region.

Experimental Results and Discussions
All the experiments were performed on the Google Colaboratory platform using the Keras and TensorFlow libraries. The programs were run on a GPU runtime with 12.69 GB of RAM and 107.72 GB of disk provided by the Python 3 Google Compute Engine backend. In total, four experiments were performed in this study, consisting of a fifteen-layered CNN in its normal and deformable forms and a ResNet-50 model in its normal and deformable forms for COVID-19 screening.
Both the normal and deformable fifteen-layered CNN models are trained and validated using the collected COVID-19 CT dataset with an input shape of 150 × 150 × 3. A dropout rate of 0.4 is used in the dropout layer of both configurations. The Adam optimizer with a learning rate of 0.001 and the categorical cross-entropy loss function are employed to compile the models. The number of epochs and other hyperparameters are tuned for the best learning process, and the number of epochs is finally set to 60. The train-valid-test splitting ratio is 80 : 10 : 10, and Figure 6 shows the learning curves of both the normal and deformable CNN.
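A random 80 : 10 : 10 split over the 2481 images can be sketched as follows. This is only an illustration operating on image indices: the seed, the integer rounding rule, and the function name are assumptions, and the article does not state the exact subset sizes or the splitting code that was used.

```python
import numpy as np

def split_indices(n_samples, seed=0):
    """Shuffle indices and split them 80 : 10 : 10 into train, valid, test.
    Any remainder from integer rounding is assigned to the test subset."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = n_samples * 80 // 100
    n_valid = n_samples * 10 // 100
    train = order[:n_train]
    valid = order[n_train:n_train + n_valid]
    test = order[n_train + n_valid:]
    return train, valid, test

# 2481 is the dataset size reported in the article.
train_idx, valid_idx, test_idx = split_indices(2481)
print(len(train_idx), len(valid_idx), len(test_idx))
```

Splitting by index keeps the three subsets disjoint by construction; a stratified split would additionally preserve the COVID to non-COVID ratio in each subset.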
The accuracy curves in Figure 6 are erratic because raw CT images are passed to the models in random order. In the training phase, a callback is utilized to save the best model with the highest accuracy. The training accuracy reached 91.8% and 90.3% in the normal and deformable CNN, respectively. The models are then saved with validation accuracies of 90.7% and 91.9% for the normal and deformable CNN, respectively. Finally, the models are tested independently with the test dataset, which is the 10% of the main dataset split off initially. Test accuracies of 92.4% and 93.2% have been found for the normal and deformable CNN, respectively. The confusion matrices are shown in Figure 7 for the analysis of true and false predictions. The deformable model is expected to minimize the overall false predictions, and Figure 7 confirms that the overall number of false predictions is reduced in the deformable CNN model. This experiment shows that the deformable CNN can outperform the regular CNN.
Then, the state-of-the-art CNN model, ResNet-50, has been selected for this experiment on COVID-19 screening. First, the whole model has been developed from scratch according to its original architecture. Then, its deformable form is created as described in the proposed model subsection. All the parameters of both models (normal ResNet-50 and proposed deformable ResNet-50) are trained on the collected CT dataset; no transfer learning technique is employed here. The training dataset is used to train the model with an input shape of 64 × 64 × 3. The hyperparameters are set to standard values after various tuning processes addressing overfitting and underfitting problems. The Adam optimizer and categorical cross-entropy loss function are used to compile the normal ResNet-50 and deformable ResNet-50 models. The learning curves for both normal and deformable ResNet-50 models are shown in Figure 8. Though a sudden abrupt shift is seen in the learning phase of the models, as in Figure 8(b), the callback function is used in these experiments to keep the best model with the highest accuracy. The training accuracy of the proposed deformable ResNet-50 model reached 99.5%. The ratio between validation and test [...] Figure 9. The total number of false predictions is reduced from 8 to 6 in the deformable ResNet-50. From this confusion matrix, it is clear that the proposed deformable ResNet-50 model is more robust than its regular form. Table 2 presents the overall test results of the four experiments in this study. According to Table 2, the deformable models achieve higher accuracy than their base counterparts. Each of the four models has been tested with a single CT image by loading the model with its trained weights. All the models give a prediction for a single CT scan image within a few milliseconds.
Computation time is an important factor for model performance analysis and for any diagnosis system. In this regard, we have calculated the CPU times required for a single CT image prediction in all our experiments. CPU times depend on the input image shape. For the first two experiments (the normal and deformable fifteen-layered CNN), the input image shape was 150 × 150 × 3, whereas the regular and proposed deformable ResNet-50 took an input shape of 64 × 64 × 3. The normal and deformable fifteen-layered CNN take CPU times of 46.7 ms and 71.6 ms, respectively, for the prediction of a single CT image. This time includes image loading, resizing according to the model's input shape, and prediction. The normal ResNet-50 model and the proposed deformable ResNet-50 model take CPU times of 55.2 ms and 68.1 ms, respectively, for a single image. Hence, the deformable form takes a little more time than its original form due to the extra parameters contained in the deformable layers.
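One way to measure the CPU time of a single-image prediction, including resizing, is sketched below with the Python standard library. The nearest-neighbour resize and the `dummy_predict` function are hypothetical stand-ins for the image pipeline and the trained model, so the measured value is not comparable with the timings reported above.

```python
import time
import numpy as np

def resize_nearest(image, height, width):
    """Nearest-neighbour resize of an (H, W, C) array to (height, width, C)."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]

def dummy_predict(batch):
    """Hypothetical stand-in for a model's predict call: two class scores."""
    mean = float(batch.mean())
    return np.array([1.0 - mean, mean])

# Synthetic array standing in for a loaded CT image.
raw = np.random.default_rng(0).random((512, 512, 3))

start = time.process_time()            # CPU time, not wall-clock time
resized = resize_nearest(raw, 64, 64)  # match the 64 x 64 x 3 input shape
scores = dummy_predict(resized[np.newaxis, ...])
cpu_ms = (time.process_time() - start) * 1000.0
print(f"CPU time: {cpu_ms:.2f} ms")
```

Averaging over many repeated predictions gives a more stable estimate than a single measurement, since the first call to a model often includes one-time setup cost.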
The receiver operating characteristic (ROC) curve is a widely used graphical representation of classifier performance. Figure 10 illustrates the ROC curves for all experiments, including our proposed method, and shows the area under the curve (AUC) values of all models. The AUC of the proposed deformable ResNet-50 model is found to be 0.998, which indicates the effectiveness of our proposed method for COVID-19 detection.
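The AUC has a simple interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The minimal NumPy sketch below computes it from that pairwise definition; the labels and scores are toy values for illustration, not outputs of the proposed models.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Tied pairs count as half a correct ranking."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(correct / (pos.size * neg.size))

# Toy example: three of the four positive-negative pairs are ordered correctly.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form uses memory proportional to the product of the class sizes, which is acceptable for a test set of a few hundred images; rank-based implementations scale better for large datasets.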
Grad-CAM visualization is a useful tool for examining what the model has learned in positive and negative cases using a heatmap view of the images [38]. It uses gradients at the final convolutional layer to identify the region of interest for a specific class. Figure 11 shows the Grad-CAM view of (a) COVID images (Class 1) and (b) non-COVID images (Class 0) produced by the proposed method. In Figure 11(a), ground glass opacity and consolidation of the COVID-infected lungs are accurately highlighted in green, which indicates the good sensitivity of the model. In Figure 11(b), no specific opacity or consolidation is detected in the CT images because the cases are negative, and the green color is dispersed across the images. Therefore, the Grad-CAM views indicate that the convolutional feature extractor of the proposed deformable ResNet-50 model provides well-localized features as the classifier input.
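The core Grad-CAM computation can be sketched in a few lines of NumPy, assuming that the final-layer feature maps and the gradients of the class score with respect to them have already been extracted from the network. The toy arrays below are hypothetical; obtaining real gradients requires the deep learning framework's automatic differentiation, which is not shown.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from (H, W, K) feature maps and matching gradients.
    Channel weights are the spatially averaged gradients; the weighted sum
    is passed through ReLU and scaled to the range [0, 1]."""
    alphas = gradients.mean(axis=(0, 1))                 # shape (K,)
    cam = np.maximum((feature_maps * alphas).sum(axis=-1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy 2x2 feature maps with two channels.
feature_maps = np.zeros((2, 2, 2))
feature_maps[0, 0, 0] = 2.0   # channel 0 responds at the top-left
feature_maps[1, 1, 1] = 4.0   # channel 1 responds at the bottom-right

# Channel 0 supports the class (positive gradient), channel 1 opposes it.
gradients = np.stack([np.full((2, 2), 1.0), np.full((2, 2), -1.0)], axis=-1)

heatmap = grad_cam(feature_maps, gradients)
print(heatmap)
```

In practice the low-resolution heatmap is upsampled to the input image size and overlaid with a color map, which is how views such as those in Figure 11 are typically produced.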
In this study, we have also carried out a comparative analysis with related deep learning-based state-of-the-art works. The performance metrics of deep learning models usually depend on the size of the dataset used for training. So the articles that used the same dataset, or one close in size to our dataset, as well as related deep CNN models, were selected for an appropriate comparison of performances. Table 3 presents the comparative analysis of the detection results with the recent works. As seen from Table 3, the proposed deformable ResNet-50 model outperforms the related methods. It also shows very few false predictions, owing to its capability to model geometric deformation. So, it can be a reliable

Conclusions
The deployment of DL techniques in various medical diagnosis systems is now growing worldwide, and it speeds up early diagnosis in healthcare environments. In this article, we have proposed a COVID-19 disease detection technique from chest CT using the deformable deep CNN. Different experiments were performed for better model selection. The impact of the deformable concept has been examined through a comparative performance analysis of the designed deformable and normal models, and the deformable models show better prediction results than their normal forms. Extensive analysis shows that the proposed deformable ResNet-50 model performs satisfactorily, with an accuracy of 97.6%, compared with the state-of-the-art techniques. The Grad-CAM visualization also provides noteworthy evidence of the localization of the targeted regions at the final convolutional layer. In the future, more diverse and critical CT datasets will be utilized for training to boost the robustness of the model. Finally, this study showed that the proposed method can be useful for effective COVID-19 detection as a substitute for RT-PCR when time and availability are limited.